Dataset Title:Regional Human Capital Database (RHCD), Regional Elections Database (RED), and Regional Migration Database (RMD)Authors:Jane GingrichDan McArthurMihnea CuibusDate:December 18, 20251. OverviewThis Dataverse contains three harmonized subnational datasets:- Regional Human Capital Database (RHCD)- Regional Elections Database (RED)- Regional Migration Database (RMD)The datasets provide longitudinal data at the Local Administrative Unit (LAU) level or equivalentsubnational units for multiple countries, covering the 1980s through the 2020s. The data are designedfor comparative research on education, electoral behavior, and migration at fine geographic scales.A complete description of variables, sources, harmonization procedures, and country-specific notesis provided in the accompanying file:Codebook.pdfUsers should consult the codebook before analysis.2. File StructureThe Dataverse contains two zipped collections of files.2.1 Combined datasetsThese files contain all countries stacked together:- Elections_V1.dta- Elections_V1.csv
- RHCD_V1.dta
- RHCD_V1.csv
- Migration_V1.dta
- Migration_V1.csv
- Elections_V2.dta
- Elections_V2.csv
- RHCD_V2.dta
- RHCD_V2.csv
- Migration_V2.dta
- Migration_V2.csv

These files are intended for cross-national or pooled analyses. Observations are uniquely identified
by combinations of country, time, geography, and version identifiers.

2.2 Country-specific datasets

Country-level files are organized in separate folders, one folder per country. Each folder may contain
up to three datasets:
- RHCD_<country>.csv  (human capital data)
- Elections<country>.csv  (election data)
- Migration<country>.csv  (migration data, where available)

These files are intended for users working on a single country or who require faster loading times.

3. Geographic Versions

The dataset is released in two geographic versions:
- Version 1 (V1): LAU 2018 (or equivalent) boundaries
- Version 2 (V2): LAU 2021/2024 boundaries (where available)

The variable "Version" identifies the boundary system used in each observation.

Because administrative boundaries change over time, some local units cannot be fully matched across
all periods. These cases are documented in the codebook and flagged in the data where applicable.

4. Identifiers and Merging

Key identifiers include:
- Ccode: country identifier
- Year or Decade: time identifier
- LAU: local administrative unit code
- Laumatch: harmonized geographic identifier used to match RHCD and RED across time

Because census years, election years, and geographic boundaries do not always align, users are strongly
advised to merge datasets using Laumatch, Ccode, and Decade rather than raw LAU codes.

5. Temporal Coverage

Census, election, and migration data are not always available in the same years. Decade-level
aggregation is provided to facilitate matching across datasets.

Coverage varies by country and period. Some countries or decades contain missing or partially matched
observations. These limitations are documented in the country-specific notes in the codebook.

6. Known Limitations

- Some small or merged local units cannot be consistently matched across time
- Geographic precision varies across countries and datasets
- Migration data are not available for all countries or all periods
- Some election results are imputed or aggregated due to data availability constraints

Users should consult the country notes in the codebook before conducting analysis.

7. Citation

Please cite this dataset as:
Gingrich, Jane; McArthur, Dan; Cuibus, Mihnea. 2025.
Regional Human Capital Database (RHCD). Harvard Dataverse.

Dataset paper forthcoming in the British Journal of Political Science.

Users should also cite the original national statistical and electoral sources listed in the codebook
when using country-specific data.

8. Contact

For questions, corrections, or updates, please contact the authors.
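
The merge advice in Section 4 (join on Laumatch, Ccode, and Decade rather than raw LAU codes) can be
sketched as follows. This is a minimal illustration using only the Python standard library, not the
authors' own workflow. The sample rows are made up, the value columns tertiary_share and turnout are
hypothetical names chosen for the example (see Codebook.pdf for the real variable names), and the
helpers read_rows and merge_on_keys are not part of the dataset. The sketch assumes each input has at
most one row per Ccode, Decade, Laumatch combination.

```python
import csv
import io

# Merge keys recommended in Section 4 of this README.
KEYS = ("Ccode", "Decade", "Laumatch")


def read_rows(handle):
    """Read an open CSV file object into a list of dict rows."""
    return list(csv.DictReader(handle))


def merge_on_keys(left_rows, right_rows, keys=KEYS):
    """Inner-join two lists of dict rows on the given key columns.

    Assumes each key combination appears at most once in right_rows;
    if a combination repeats, the last matching row is used.
    """
    lookup = {tuple(row[k] for k in keys): row for row in right_rows}
    merged = []
    for row in left_rows:
        match = lookup.get(tuple(row[k] for k in keys))
        if match is not None:
            # Left-hand values take precedence for any shared column names.
            merged.append({**match, **row})
    return merged


# Made-up sample data standing in for an RHCD file and an Elections file.
# "tertiary_share" and "turnout" are hypothetical column names.
rhcd_csv = io.StringIO(
    "Ccode,Decade,Laumatch,tertiary_share\n"
    "AAA,1990,1001,0.12\n"
    "AAA,2000,1001,0.18\n"
)
red_csv = io.StringIO(
    "Ccode,Decade,Laumatch,turnout\n"
    "AAA,1990,1001,0.71\n"
    "AAA,2010,1001,0.65\n"
)

merged = merge_on_keys(read_rows(rhcd_csv), read_rows(red_csv))
for row in merged:
    print(row)
```

Only the 1990 row appears in the result, because it is the only key combination present in both
inputs. With the real files, the io.StringIO objects would be replaced by opened files such as
RHCD_V1.csv and Elections_V1.csv, keeping both inputs on the same geographic version. Users working
in pandas could likely achieve the same join with DataFrame.merge on the same three key columns.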